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ABSTRACT 


A short presentation is made on the basic elements of Power 
Spectral Analysis with emphasis on the Blackman-Tukey method. Short 
discussions are included on the topics of pre -whitening, frequency 
and spectral windows, and statistical reliability. Examples are 
included whenever possible, and a Fortran subroutine for calculating 
a power spectrum is presented. 
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I. BRIEF BACKGROUND 
A. Historical Roots 

Power spectral analysis is one small area of the much broader 
field of Communications Theory. This broader field is an indispens- 
able part of communications engineering and provides the theoretical 
foundations for the design and analysis of much of the advanced engi- 
neering found in modern communications systems (Lee, i960). The 
theory is basically a statistical theory in which the central idea is 
that noise and messages are considered to be random phenomena. Prob- 
ability theory is therefore incorporated into the very foundation of 
the theory and is an integral part of it. 

The basis of the theory finds its roots in statistical mechanics. 
The equivalence of time and ensemble averages, first assumed by Gibbs 
and later stated more precisely by Maxwell in his ergodic hypothesis , 
is the starting point for statistical communications theory. From 
this and the quasi-ergodic hypothesis are derived the formal proofs 
necessary for the logical development of the subject. 

B. Ergodicity and Stationarity 

Two conditions necessary for the development of the theory are 
imposed on the random ensembles of data which we wish to power spec- 
trum analyze. We shall merely state them as being the foundation for 
the development of the theory, with proofs and implications to be 


found elsewhere. 



2 


(1) Ergodicity . The ergodic theorem may be stated as "in a 
stationary ensemble of random functions having a continuous range of 
possible values the amplitudes of an ensemble member will come infin- 
itely close to every point of the continuous range of possible values 
if given an infinite amount of time." This theorem allows the replace- 
ment of ensemble averages with time averages, and is the basis for the 
formal analysis of communications theory. 

(2) Stationarity . If the amplitude probability density of an 
ensemble is time independent, the ensemble is said to be stationary . 

In practical terras related to power spectral analysis this means that 
the power spectrum of a finite data set is time independent. 

In addition to the two above conditions, it is assumed that 
the random process is Gaussian or nearly Gaussian in character, that 
is, the probability distribution of the elements in an ensemble is 
Gaussian or nearly so. Blackman and Tukey (1958) show that for an 
infinite data set a Gaussian assumption yields exact results, and 
rather good approximations otherwise. 

C. Notation 

With these preliminaries stated, we now give the notation to 
be used throughout the rest of this section. Fourier transforms, 
correlations (auto- and cross-), and convolutions are used rather 
frequently in power spectrum work. Therefore to reduce the complexity 
of the equations in the following discussion a simplified notation 
will be adopted as follows: 
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1. Fourier Transform 

Let f(t) be specified on the interval ( “ COj CD ) . Then define 
the Fourier transform 


F(co) 


-i- r f(t) e- 1 "* at 

ygs — 


We shall also use the Fourier transform operator e.g., 


y[f(t)] = f(») 


and inverse Fourier transform operator 


rhtit)! = -i- r f(t) e iuJt dt . 
/2tt 


2. Correlation Functions 

Let f (t) and f (t) be specified on the interval (-«, «). We 
then use the following notation: 
a. Convolution. Define 


"12 


(t) = lim i r T /£ f, (t) 


T-^os 


T J-T/2 l' 


f 2 (t - t) dt 


s f x (t) * f 2 (t) 


k 


where * denotes convolution. 

b . Cross Correlation. Define 


V T > ' l%% f i (t) f 2 (t + t) dt 


= f x (t) * f 2 (-t) 


c. Autocorrelation . Cross correlation of a function with 
itself. Define 


% •. ( T ) = liltt is 


r ll 


T “>00 


pT/2 

J-T/2 


f x (t) f x (t + t) dt 


= f x (t) * f x (-t) 

= f x (t) * f x (t) , 

the last step being a result of the symmetry of the autocorrelation 
operation. 

In general, then, small letters will denote functions of time 
and capital letters Fourier transforms (functions of frequency) of 
the corresponding functions of time. Letters written in script will 
be used to denote operators. Additionally, MLP will be used as an 
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abbreviation for mean-lagged-product, and FFT for fast-Fourier- 
transform. The term "power-spectral-density" (PSD) will also be 
used synonymously with "power spectrum". 
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II. THE POWER SPECTRUM FOR THE CONTINUOUS CASE 


A Fourier series is one set of orthogonal functions that may 
be used to expand a we ll -behaved function on the interval (- 00 , °°). 
If we consider a time series f(t), we can represent it by 

f(t) = ^[FW] 


where 


F(«) = Wt)] . 

The power spectrum of a function is defined as the absolute value 
squared of the Fourier transform of the function. If P(co) is the 
power spectrum, then 


p(u>) = |3 f [f(t)] I 2 = |p((n) | 2 . 


But because of the convolution theorem, which states that for two 
functions f^ and f^. 


* fg] = f 1 


F 


2 
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By setting f^ = f^ we have 


* f ] = F • F 
1 1 1 1 1 

(2) 

Now f^ * is the autocorrelation function for f^. 

tion function is always an even function, i.e., 

The autocorre la- 

f x (t) * f L (-t) = f L (t) * f L (t) 


so that there are no sine components in its Fourier 
transform is therefore real and 

transform. The 

* [f l * f l ] * - *11 


= f f = If I 2 
i l 1 l 1 

(3) 

so that 


P(a)) = |f(h>) \ 2 . 

(4) 
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Therefore 

P(od) = *(«) » ^ n (t)l (5) 

'i 

or 

■'>(«>) I 2 = ^(w) • (6) 

This equality is known as the Wiener Theorem and states that 
the power spectrum of a time series is equal to 

(a) the Fourier transform squared of the function, or 

(b) the Fourier transform of the autocorrelation function. 

It may be noted that in (a) the Fourier transform utilizes both 
sine and cosine terms, while in (b) the Fourier transform of the 
autocorrelation function has only non-zero cosine terms, that is, 

* (<d) = Sfojj/t)] 

- — — f 00 cp (t) cos cot dt 
11 



<p u (o 


co.; cot dt 


(T) 
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The first method of computing the power spectrum is the most 
direct and straightforward way. The fast-Fourier-transform (FFT) 
calculation utilizes the definition of the power spectrum directly 
to compute the power spectrum. The second method is known as the 
mean-lagged~product (MLP) method, and is the one utilized by Blackman 
and Tukey (1958) to calculate power spectra. 


I 



III. FINITE DATA SETS 


A. Effect of Data Truncation on the Power Spectrum 
The definition of the power spectrum was made for infinitely 
long, continuous data sets. Practical data analysis, however, re- 
quires the use of considerably less data. To determine the effects 
on the power spectrum resulting from the truncation of a data set 
to finite lengths, consider the function f(t) defined on the inter- 
val (-■», oo) as shown in Figure 1. 


♦ 



Figure 1 


■> 


If we truncate f(t) by multiplying by a data window g(t) such that 


g(t) = 


1, |t | £ T/2 
0, it | > T/2 


the truncated data set becomes 


h(t ) = f(t) g(t) 
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as shown in Figure 2. 


i 

h(t) = f(t) g(t) 

■ ■ ,J 

< 




lT/2 T/2 


Figure 2 

The function h(t) then represents the truncated data set available 
from which the power spectrum is to be calculated. The power spectrum 
p ap (“>) 0f representing the apparent power spectrum of f(t), is 

then 


P ap (u>) = |J(f • g)| 2 
= |f * g| 2 

= |f| 2 * |g| 2 , (8) 

the last step resulting from the fact that g is presumed to he an 
even function. Thus the true power spectrum 

W"' - 
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O 

is modified, by convolution with |G| , the Fourier transform squared 
of the data window. G is called the frequency window by Blackman and 
Tukey (1958). For the data set shown in Figure A-2, G is the sine 
function, shown in Figure 3. 



Convolution by the frequency window causes a certain degree of 
smoothing in the calculated power spectrum and a small amount of 
leakage via the side lobes from nearby frequency bands into the fre- 
quency band of interest. If the computed power spectrum is relatively 
flat, i.e., has a dynamic range of less than 2 or 3 orders of magni- 
tude, this smearing or leakage causes little or no problem, for the 
amplitude of the largest (first) side lobe of the sine function is 
only on the order of several percent of that of the main lobe. If, 
however, there are large, well-defined peaks in the power spectrum 
such large peaks act as a first approximation delta function. When 
convolved with the frequency window they act to reproduce the win- 
dow, producing spurious peaks in the power spectrum corresponding to 
the side lobes of the frequency window. These spurious peaks may 
be mistakenly identified as structural details of the true power 
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spectrum when in reality they are merely artifacts created by trunca- 
tion of the original data set. 

B, Reducing Frequency Window Leakage 
Because of the frequency window side lobe leakage associated 
with data function truncation, one is properly concerned with means 
of reducing or minimizing the undesirable effects of such truncation. 
Several methods are available for doing this, each depending somewhat 
on the particulars of how the power spectrum is computed. Three 
commonly used methods will be described in the following sections, 
each being intended to illustrate the basic features of and rationale 
behind each procedure. 


1. Data Tapering 

One of the obvious methods of reducing the side lobe leakage 
is to choose the data window g(t) such that its Fourier transform 
G(<ju) has either no side lobes at all or side lobes that are small 
compared to the main lobe. Exartgcles of the former are: 

-(|t|/2t f 

g(t) = e (Gaussian) 

where t is some scale factor. This function has a Fourier transform 
o 

equal to a Gaussian; and 

,, sin 2tt Af t 
g(t ) 


2tt qf t 


(sine function) 
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which has a Fourier transform equal to the box-car function. 

However, for the side lobes to be eliminated altogether it is 
necessary to extend these data windows to ± If they are truncated 
at some finite value (as they must be in practice), side lobes are 
again introduced, though they will be smaller than the case where 
g(t) is the box-car function. 

A simpler method is to taper the data set at the ends using 
data windows indicated in the following sketches: 



or 



Although side lobes are still present in the frequency windows 
for each of the above cases, they will be smaller than for the case 
where the simple box-car data window was used. The smaller side lobes 
are a result of replacing the abrupt discontinuities of the original 
box-car data window with more gently sloping functions. 
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Data tapering is most often performed when the FFT method is 
used to compute the power spectrum. For a more complete discussion 
of this topic see Enochson and Otnes (1968). 

2. Tapering the Autocorrelation Function 
If the MLP method of computing the PSD is used the problem of 
side lobe leakage is not as simple as it was in the case of computing 
the PSD directly from its definition. An additional factor for con- 
sideration enters when not only the original data set must be trun- 
cated, but so also the autocorrelation function. When an MLP cal- 
culation is made it becomes advisable, for reasons to be discussed 
in connection with statistical reliability, to truncate the auto- 
correlation function at a maximum lag of not more than 10—20$ of 
the length of the data set (Blackman and Tukey, 1958). In order to 
see what effect this second truncation has, consider the original 
data set f(t) properly truncated with a data window g(t). The auto- 
correlation function cc^(t) then becomes 

cP i:l (t) = [f(t) * g(t)] * [f(t) • g(t)] 


or 


^11 = ( f * §) * (f ' g) 



1 6 


Now 


cp represents the entire autocorrelation function avail- 

^11 ' r, /% Oz-Vrf 


a w e after f is truncated by 


ta truncate <P, , at a lag of 10 2C$ 


of the length of the data set therefore involves specifying another 
function q( T ), called the ^window, hy which qx^ is multiplied 
in order to effect truncation. Let cp^ represent the truncated 
autocorrelation function. Then 


eWf s>J • * • 

The apparent power spectrum is the Fourier transform of the 
truncated autocorrelation function so that 

p ftn) = S{t(f • g) * (f • 8)1 ' <lJ 

ap 

= ?[(f • g) * (f • g)] * ? (c) 

= \?{t . g) • y(f • g)l * ? (q) 

= [(F * G) ' (F * G)] * Q t ^ 


where Q (the so-called spectral window) is the Fourier transform of 
q, the lag window. Now g and q are normally chosen to be even 

i\inctions y so that 
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P ap (<u) - ( !f | 2 * |g| 2 ) * Q 

= CP true (“>) * b(«*») I 2 ] * Q • (10) 

We see that, as in section III- A, the true power spectrum has been 
altered by convolution with |g(u>) | 2 , and additionally by convolution 
with Q. If q(t) was the box-car function, then Q(to) is the familiar 
sine function, not squared. It is this last convolution that can 
lead to negative values in the apparent power spectrum even though 
negative values are theoretically impossible in a power spectrum. 
They are, in this case, merely an artifact due to truncation. 

To remove the possibility of negative values for the power 
spectral estimates, as well as making the side lobes of the window 
Q smaller, it is once again advisable to tailor the shape of 
the lag window q(x). It is to be tailored in such a way that the 
side lobes that remain are small in comparison to the main lobe, and 
damp out quickly with increasing distance from the main lobe. Two 
lag windows that have been found to do this effectively are: 


( 1 ) 



l TTT I 

+ cos — 
1 , 



£ T £ T 

m 


where T^ is the greatest lag used in the autocorrelation function. 
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The use of this window is called "harming". A second, more commonly 
used lag window is 


(2) q 2 (x) = 0.5*+ + 0.1+6 cos p , - T m < T 5 T m . 

m 

Use of this window is called "hamming". 

The two functions and along with their associated 
transforms and 511,6 shown in Figure 4. 



Figure 4 

(After Blackman and Tukey, 1958.) 
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Although in principle q( T ) would normally be applied to the 
autocorrelation function prior to transforming, the interchange- 
ability of the integrations associated with the transform and the 
convolution makes it possible to convolve with Q(< u ) after the 
transform has been completed. This has particular advantages when 
the data are in digital form. In the case of hamming or hanning, 
the convolution will take the form of a 5-point smoothing formula 
easily applied to the power spectrum estimates calculated by trans- 
forming the autocorrelation function. More will be said about this 
form of smoothing in section V when a specific power spectrum example 
is given. 

It might be added that when the MLP method for computing PSD's 

is used it is usually unnecessary to be overly concerned about the 

2 

effects of the convolution of |Q(,„)| with p true (rn) [see Eq. (lO)]. 

p 

Since G(u)) is squared, whereas Q( (U ) is not, the side lobes of |G( (I) )| 
will be unimportant compared to those of Q(u))> The half-width of the 
major lobe of G(cu) is also small (~ 1 order of magnitude smaller) 
compared to that of Q(<«) since the original data set is ~ 10 t im es 
longer than q( T ) Therefore the effects of Q will far outweigh those 
of G, and it is those effects that hamming or hanning are designed 
to offset. It is accordingly not necessary to taper the original data 
set (or pre-whiten; see next section) except in cases of extremely 
discontinuous spectra, or when extreme care need be taken to assure 
the validity of the results. 
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3. Pre-whitening 

As was mentioned previously, if there are large, well-defined 
peaks in the power spectrum such peaks can produce spurious detail 
in the power spectrum due to frequency or spectral window side lobe 
leakage. It is for this reason that methods were sought to reduce 
the side lobes. An alternate method that can be used on occasion is 
known as "pre-whitening". This procedure involves flattening the 
power spectrum prior to calculation by passing the data through a 


filter with a known power transfer function in order to eliminate 

large peaks and discontinuities. With the peaks and gross discontin- 

uities thus removed, the effective convolution of |Fj with |G| to 

2 

yield P (,„) will not act to reproduce the side lobes of |G| as 

3»P 

would have been the case had the peaks not been removed. Once the 
power spectrum is computed the inverse of the pre -whitening, the 
so-called "post-darkening", is applied to the PSD estimates to 
complete the calculation. 

In order for pre-whitening to be used effectively some prior 
knowledge of the expected shape of the spectrum to be flattened must 
be available. This is necessary in order to design a filter with 
the proper power transfer function. An example of how pre -whitening 
may be used is the following: Suppose we have a data set for which 

we know approximately the shape of the power spectrum, and suppose it 
has a very large low (zero) frequency component, as in Figure 5. 
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3 

2 

log P(a>) 

1 
0 

cu 

Figure 5 

If the power spectrum is computed directly from the data with- 
out some measures being taken to compensate for possible side lobe 
leakage, the apparent power spectrum p a p( UJ ) would look something like 
Figure 6. 

3 
2 

i°g P ap (to> 

1 
0 

co 




Figure 6 
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The small scale structure may well be that solely due to side 
lobe leakage, though in the case where confidence in the results is 
low this structure can be masked by statistical noise. 

Suppose now that the original data f(t) is pre-whitened by 
convolution with a smoothing function s(t). Then the data set h(t) 
available from which the PSD calculation is made will be 

h(t) = f(t) * s(t) 


or more simply, 
h = f * s 


Then 


P ap (tt>) = i?(h).| 2 = |*(f * s) | 2 



y 


or 


P ap («0 = If 2 I • Is | 2 


" P true^ * I 3 !" 


( 11 ) 
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since s(t) is an even function. 

Thus, if the original data set is convolved with a smoothing 
function s(t), the true PSD is modified by the direct product of the 
square of the transform of s(t). js | 2 is called the power transfer 
function. To compensate for this at the end of the calculation we mul- 
tiply by the inverse of |s(u>) | 2 to restore the proper shape to P(u>). 

In the above example suppose we choose a smoothing function 




1 + cos 


rr(t - t ) 
o 


2A t 


s(t) = 


It - t Q | < 2 At 


It - t | s 2 At 
o 


where At is some scale factor. This smoothing function may be passed 
over the data as many times as one desires, more smoothing being 
accomplished with each pass and more of the high frequency components 
being suppressed. 

If the data is in digital form the smoothing function above 
takes the form such that if f'(t) is the smoothed data, 

q'tt) - | q( t) - £ rq.^t) ♦ f i+ 1 (t>] . da) 


Holloway (1958) showed that for n successive passes of the above 

elementary filter function through the data, the power transfer func- 
I |2 

tion |S| becomes 



2k 


|S j 2 = cos ^(rrf At ) , 

where At is the data sample spacing. 

The above filter function may be used to effectively remove 
the low-frequency component in the present example. If the smoothed 
data (low-pass filtered) is subtracted from the original unsmoothed 
data, the difference will represent the original data filtered by a 
high-pass filter with a power transfer function equal to the comple- 
ment of the original transfer function. Suppose the original data 
is filtered using this method. The smoothing function applied to the 
original data set f(t) will have a power transfer function similar to 
Figure 7. 



When this smoothed data is subtracted from the original f(t), 
the resultant power transfer function will be the complement of the 


function shown above. 
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Figure 8 


CO 


Applied to our example of data with a large low-frequency 
component, this will effectively flatten the curve to minimize the 
possibility of side lobe leakage. After the PSD has been computed 
post -darkening is achieved by multiplying by the inverse of the func- 
tion shown in Figure 8 to complete the calculation. 

For a more complete discussion of the topic of pre-whitening 
and digital filtering techniques see Blackman and Tukey ( 1958 ) and 
Enochson and Otnes (1968). 
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IV. DIGITIZED DATA 


The discussion of* power spectra has been general up to this 
point, with only an occasional reference to specifics. Since we are 
here primarily interested in data that appears in discrete, digital 
form, it is appropriate to specialize to that case. We shall from 
this point forward consider only the power spectra of data consisting 
discrete, ecjui-spaced samples. The discrete forms of the general 
equations of section II will be given for both the MLP and FFT cal- 
culations. The statistical reliability of PSD estimates will be 
discussed briefly for each of the two methods, and several considera- 
tions helpful for planning will be mentioned. The averaging of 
several power spectra is also mentioned. 

A, Discrete Forms of Relevant Equations 
1- MLP Method 

The steps required for calculation of* a, power spectrum using 
the MLP method may be summarized from the above discussion to be the 
following : 

(!) Pre-whitening . If the power spectrum is known to contain 
large peaks or discontinuities, the raw data should be pre-whitened 
by use of appropriate digital filters (high-pass, low-pass, band-pass, 
or combinations of these). The necessity for this procedure is 
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somewhat open to interpretation according to one’s perception of 
what constitutes a large peak or discontinuity. My own limited 
experience over several years is that for a power spectrum with a 
dynamic range of > 2—3 orders of magnitude , pre-whitening to 
flatten the spectrum prior to its computation is advisable. The 
example given in section III of a method of constructing a high- 
pass filter was used successfully by the author (M.S. thesis , 1973). 
Since, however, the actual method used to pre-whiten will depend 
on the details of the various power spectra encountered in practice, 
the reader is referred to chapter 3 of Enochson and Otnes ( 1968 ) 
for a thorough discussion of recursive, non-recursive, and second- 
and higher-order filters and their application to time series. 

(2) Normalizing the Data . Although not mentioned in the 
previous sections, it is advisable in practice to normalize the data 
to zero mean and unit standard deviation before calculation of the 
PSD. Since the calculation of the mean-lagged-products (autocorrela- 
tion function) involves the sum of many products, it is easy to termi- 
nate a computer calculation of a PSD prematurely due to overflow. 
Subtracting the mean from the data removes only the zero-frequency 
cosine term from the PSD, and can be included in the calculation after 
the remaining following steps are completed. Dividing the (pre- 
whitened) zero-mean data set by o reduces the data to unit standard 

deviation. Correct absolute units can be restored to the final com- 

/- 

puted PSD if desired by multiplying by a . 
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(3) Calculation of the autocorrelation function. If cp. 
0 

denotes the j'th value of the autocorrelation function, the discrete 
form for the autocorrelation function of the data set f(t) with data 
sample spacing At containing n discrete values f^ is 


cp 


1 

j “ n - j Z» Q f i f i+d 


J — 0, 1, 2, ... m , (l3) 


where m is the maximum number of lags in the autocorrelation function. 
As mentioned in section II, m will generally be limited to 10 — 20 % 
that of n. 

(4) Fourier transform of the autocorrelation function . Since 
the autocorrelation function is an even function, the discrete form 
for the Fourier transform becomes the discrete finite cosine series. 

Applying this to the sequence cp^, , we obtain as raw 

estimates for the PSD 


P. 

i 


At 




m-1 

+ = E ">, 

J=1 0 



+ ro cos in 

m 


(l4) 


i = 0, 1, ... m 


where At is the sample spacing. 
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(5) Smoothing the raw estimates . The raw estimates calculated, 
in step (4) correspond to Eq. (lO) 

P ap (c>) = [p true U) * |GU)! 2 ] * Q , 

with a box-car lag window leading to a sine function spectral window. 
What remains is convolution with a suitably tailored spectral window 
Q with small side lobes to complete the calculation. A commonly 
used window is the hamming window, the digital form of which becomes 

P i = °‘ 5k P i + °' 23 (P i+ i + P i-i ) • (15) 

(6) Post -darkening . If the raw data were pre-whitened in 
step (l), the last step in the calculation of the PSD will be to 
restore the true shape of the power spectrum by post -darkening. This 
process involves multiplying the smoothed estimates of step (5) by 
the inverse of the pre-whitening filter power transfer function. 

The frequency resolution Af of the resultant spectrum will be 


2m At 


where 


m = the maximum number of lags in the autocorrelation function 
and 

At = data set, sample period. 
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The maximum frequency of the computed PSD will be 


f 

max 


= mAf = 


1 

2At 


(16) 


in accordance with the sampling theorem. 

2. FFT Method 

The fast -Fourier- transform technique of computing the PSD of 
a time series was a major improvement over the MLP method in terms of 
the actual computer time taken for the calculation. If N is the num- 
ber of points in a time series, the total computer time needed for 
an MLP calculation is roughly proportional to N 2 , where for the FFT 
method it goes approximately as N(log N). The advantages of utilizing 
the FFT method whenever many points are being power spectrum analyzed 
therefore lie on the side of efficiency rather than any fundamental 
superiority of the method over that of the MLP. As a technique of 
computing PSD’s, it is quickly supplanting the MLP method and is 
therefore worth studying. An extensive discussion of the FFT com- 
puted one-dimensional PSD by Brault and White (l97l) provides a 
thorough introduction to FFT methods and their application to astro- 
nomical problems. 

Although no attempt will be made here to outline in detail the 
steps required for FFT calculation of a PSD, the basic steps are as 


follows : 
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(l) Pre -whi tening . The necessity for pre-whitening the data 
is less strong when the FFT method is used than it is for the MLP 

method. The reason for this lies in the fact that the FFT calculated 
PSD takes the form 

V a ' ) ' P tru** l G <»>l 2 > 

whereas the MLP method yielded 

V*) = [P true * | 2] * <*(«>) • 

The FFT form does not involve a convolution with Q, so that 
one need only he concerned with the shape of G(u>). If g(t) is 
properly specified the side lobes of |g( u ,)| 2 can be kept small enough 
so that very little leakage is present [see step ( 3 ) below]. Only 
in the case where there are exceptionally large peaks or discontin- 
uities in the power spectrum should pre-whitening be necessary. In 
general, it may be said that in most cases this step will not be 
necessary at all. 

^ Normalizing the data. It is advisable to normalize the 
data to zero-mean and unit standard deviation, for the same reasons 
as given for the MLP method above. 

^ Data windo w correction . As discussed in section III-A, 
truncation of a data set can produce spurious detail in the computed 
PSD. It is therefore necessary to choose a data window g(t) by which 
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to multiply the time series such that the frequency window G(cu) will 
have small side lobes. A window currently in common use is the ten- 
percent cosine bell, the application of which is called coswinding . 

For the generalized formula for this data window see Brault and White 
(1971, Equation 13). 

(b) Appending zeros to the data . The most efficient versions 
of the FFT require the data to consist of a number of points equal 
to a power of two, although some generalized versions will work with 
an arbitrary number of points. If a version is used requiring some 
specific number of points, the original normalized data set to be 
transformed must have zeros appended to it to bring the total points 
up to the specified number. 

(5) Transforming the data . The normalized data set f^ is 
transformed using the forward discrete Fourier series transform to 
yield Fourier coefficients 


n -1 
o 


a j + lb j 


At ^ ,| TT 

- £ 


j — 0 , 1, 


(17) 


where 

At = data sample spacing , 

n Q = number of data points in the normalized data set, and 
i = /T 

(6) Computing the raw spectrum . The raw PSD is just the ab- 
solute value squared of the Fourier transform, adjusted to compensate 
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for the appendage of zeros to the original data set. 


P. = 
1 


n o / 2 

IT (a i + 


»i> 


(18) 


where n - number of points in the original data set. 

(7) Smoothing the raw estimates . The raw estimates obtained 
in step (6), although theoretically correct, may be statistically 
unreliable. It is possible to increase the statistical reliability by 
smoothing (convolving) the estimates with a suitable smoothing func- 
tion. For an excellent discussion on the subject of smoothing raw FFT 
estimates see Edmonds and Webb (1972). 

The frequency resolution (PSD estimate spacing) of the FFT 
calculated PSD will be 


Af - 


1 

n At 
o 


(19) 


For specifics on the programming of the FFT algorithm, including 
Fortran indexing peculiarities, one should consult ‘Special Issue 
on FFT and Its Applications to Digital Filtering and Spectral Analysis', 
IEEE Trans. No. 2 ( 1967 ). Another special issue on the same 

topic, IEEE Trans. AU--17, No. 2 ( 1969 ), gives an extensive biblio- 
graphy. 
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^ — — e Statistical R eliability of the PSD Estimates 

\ 

1. Confidence Limit q 

The calculation of the PSD from a finite number of discrete 
points can be ejqjedted to have associate'^ with it a set of confi- 
enee limits reflecting the presumed statistical nature of the ori- 
ginal time series. theory, 'if a PSD calculation could be made from 
an infinite number off data points, one would have absolute confidence 
in the results. In practice one must settle for a f finite number of 
points from which tf> make the Calculation. Each poi ni in the resul- 
tant power spectrum will have a set of "confidence limits" assigned 
to it reflecting the confidence (jin a statistical sense j\that the re- 
sulting power/ spectrum is not due^ solely to randomly distributed 
data points./ j '■ \ 

j h 

The statistical accuracy of the* computed PSD is estimated in 

terms of equivalent degrees of freedom", from which the confidence 
limits ejj;e calculated. The degrees of freedom may be thought of as 
representing the number of estimates ojf power in the frequency inter- 
val Af, the frequency resolution of the computed PSD. A measure of 
the^rohability that an estimate falls] within an upper and . lower 

■und, the ratio of which is designate!* the confidence factor f , is 
given by ' \ 


f = 10 


.b/ (lO/k^l) 


= exp 

\io/rr^} 


( 20 ) 


) 
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where 

f = confidence factor for a given confidence , 
k = degrees of freedom,, and 

b = factor depending on the desired confidence, given as 
follows : 


b 


confidence 

50$ 80 $ 90$ 96$ 98$ 

8 16 20 25 29 


This is an approximation, but for k ^ 4 is very close to more exact 
calculations based upon chi-square tables (Edmonds, 1966). 

The confidence factor f^ may be used as error bars for the 
computed PSD, being positioned on each data point in such a way 
that the point falls on the geometric mean of the upper and lower 
limits of f c , Another way of stating this is that the upper limit 
for the confidence limits is equal to the PSD estimate times the 
square root of f ; the lower limit equals the estimate divided by the 
square root of f . If the PSD is plotted on a semi-log scale the 
confidence limits will be centered on each point. 

A plot of f as a function of degrees of freedom k for several 
different confidences is shown in Figure 9. The ordinate expresses 
f in Db; to determine f one goes across to the straight line labeled 
"factor", then reads f off the abscissa. As an example, for k = 30 
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the 90$ confidence factor expressed in Db is seen to be « 3-8, for 

which f « 2.3. 
c 

The next two sections will consider formulas for determining 
k for each of the two methods of computing PSDs. 

2. k for the MLP Calculation 

The number of degrees of freedom k for the one-dimensional MLP 
calculated PSD is given by (Blackman and Tukey, 1958) 


k = 



1 


3 


where 

N = the number of data points in the data set, and 

M = the maximum number of lags in the autocorrelation function 
= the number of points in the computed PSD. 

As can be seen from this formula, k is nearly proportional to 
the ratio N/M. In order to achieve maximum confidence in the PSD 
estimates it is necessary that k be made as large as possible; this 
will insure that the confidence factor f [Eq. (20)] will be small. 

N is usually some fixed number of points, so M is therefore chosen to 
be small relative to N. It is for this reason that M is normally 
chosen to be not more than 10 — 20$ of N. 



37 


3- k for the FFT Calculation 

The number of degrees of freedom k for the one-dimensional 
FFT calculated PSD is given by (Tukey, 1967) 


k = 2p 


fN - ^ 
“N 


where 

N = the number of points in the original data set, 

W T = the number of points tapered by the data window g(t), 

N q = the total number of data points after zeros have been 
appended, and 

p = the effective number of points involved in smoothing of 
the raw PSD estimates. 

Although the MLP and FFT formulas for k appear to be super- 
ficially different, it is not difficult to show that the confidence 
limits calculated from a given k are essentially equivalent for the 
two cases, as would be expected. 

C. Planning Considerations 

In the previous sections the basis of a PSD calculation was 
discussed with appropriate equations given. In a practical applica- 
tion some thought must be given to the problem of choosing data s am ple 
spacing, frequency resolution, and the frequency range over which the 
power spectrum is to be calculated. This section will deal only with 



38 


considerations to be made in performing an MLP calculation; similar 
considerations will apply to an FFT calculation. 

1. Choosing At and f 

B max 

The raw data from which the PSD is to be calculated is assumed 

to consist of discrete, equi-spaced samples of period At. Then if 

f denotes the maximum frequency for which PSD estimates are ob- 
max 

tained we have from the sampling theorem 

f = ~~ 
max 2 At 

fjiax is clearly independent of the total number of data points and 
the maximum lag M of the autocorrelation function. It depends solely 
on the sample spacing. If in an analysis one wishes to examine the 
PSD up through a specific frequency, this formula shows what the 
corresponding minimum sampling frequency must be. Likewise, if it is 
desired only to compute the PSD for a limited frequency range, by 
choosing the correct At the resulting PSD will have a frequency range 
of exactly the right size. This fact is helpful in minimizing the 
number of data points reauired in a calculation. For example, if one 
were interested in examining the PSD of a set of data only in the fre- 
quency range 0 — 5 Hz, the sampling period At would be 0.1 sec. Samp- 
ling more often than this would increase f^ y beyond the range of 
interest, increase the total number of data points, and thus the 
machine calculation time, and generally would not add any information. 
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If the sampled data already exists for which the power spectrum 
is to be calculated, it is possible to adjust At to near the desired 
value by (a) decimation, using only every p’th point of the original 
data or (b) averaging, averaging by groups of p points. The former 
method increases At by a factor of p and maintains the standard devia- 
tion (for random or near-random data) about the mean. The latter 
increases At by a factor of p but decreases the standard deviation 
about the mean by a factor of /p. For further discussion on these 
methods consult Enochson and Otnes (1968). 

2. Choosing M and Af 

The frequency resolution Af of the calculated PSD is 



where M is the maximum lag in the autocorrelation function. This 
formula may be used for either determining in advance of a calculation 
what the frequency resolution will be, or for determining M for a 
given desired Af. As discussed in section B, M should be small 
relative to the total amount of data. This fact must also be taken 
into account when planning a PSD calculation. 

As an example of how the foregoing considerations may be used, 
suppose we wish to calculate the PSD of a data set for the frequency 



4o 


range 0 — 5 Hz. Suppose further that we wish the frequency resolution 
Af to he 0.2 Hz, providing a total of 2 6 points (including the end- 
points) in the power spectrum, and that the 9 0$ confidence factor is 
to be < 2.0. Then 

At = j^ry =0.1 sec , 

= the maximum lag of 2 6 data points. 

From Figure 9 we find that for the desired confidence we must 
have k > 45, from which 

T > -g- = — g — , = 61 sec 

for which 

N = 6l0 data points . 


As a second example suppose a data set already exists consist- 
ing of 500 points with sample spacing 1 sec. If the autocorrelation 
function is truncated at 10$ the length of the data set, we calculate 

f , Af, k, and f (90$) to be: 
max’ * * c 
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f 

max 


1 

2At 


0.5 Hz 


> 


Af 


_1_ = 1 
2M 2(50) 


0.01 Hz 


y 



19.3 


, and 


f, (90%) =2.9 


From the above simple examples it is seen that M, N, At, f“ c> 

Af, and f are to a degree interrelated. Therefore one must be some- 

IQclX 

•what judicious in their specification in order to achieve the maximum 
amount of usable information in a PSD calculation. For example, one 
can achieve a very high degree of statistical reliability by making. 

M very small. However this would be at the expense of the frequency 
resolution. In general, for a given data set there is a trade-off 
between statistical reliability and frequency resolution. This prob- 
lem can be surmounted by simply increasing N (using more data) but 
then the computing time increases, and so on. 

D. Averaging Power Spectra 

It is occasionally desired to study the average characteristics 
of power spectra over long periods of time. Although it is possible 
to compute an average power spectrum over a long data set, it is 
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sometimes more efficient to break it up into shorter sets, compute the 
PSD for each individual data set, then average the results. This pro- 
cedure is also necessary if, for example, several short, non- contiguous 
data sets of varying lengths exist for which one wishes to extract an 
average power spectrum. The validity of the averaged results will 
depend on the stationarity of the data; for data which is non- 
stationary the results will be influenced by the lengths and number of 
data sets used in the calculation. [For a short discussion on this 
point see Sentman (1973).] The following sections apply to MLP cal- 
culated PSD's, though similar considerations would apply to FFT cal- 
culated PSD's. 

1. Averaged Power Spectra from Data Sets of Equal Lengths 

If it Is assumed that the frequency resolution Af is identical 
for each PSD comprising the average, the statistical reliability of 
each is the same. Then 


P. 

1 




where 

P = i'th point in the averaged power spectrum, 
P. . = i'th point in the j’th power spectrum, 

IT = the number of spectra being averaged, and 
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o 


S 


= standard deviation about the mean of data in the original 

j'th (pre -whitened) data set. If the data in the j 'th 

data set were divided by a after the mean was removed 

prior to calculation of the power spectrum, this weight 

is necessary to restore the proper relative units to 

P. .. If the data were not normalized this weight is 
1 3 

equal to one. 


2. Averaged Power Spectra from Data Sets of Different Lengths 
If the individual spectra comprising the average are confuted 
from data sets of different lengths , the statistical reliability of 
each PSD is different. If Af is identical for each individual PSD, 
the expression for averaging becomes 


P, = 


l 

j=l ° 


exp 


‘ - 2.3 b 

20/k . - ] 
d 


ij 


where 

k. = degrees of freedom in the j'th computed power spectrum, 

J 

and 

b = 8, for 5C$ confidence limits* 

The exponential weighting factor must be included to properly 
weigh individual spectra according to its probable error. The uncer- 
tainty in the spectra is taken to be the probable error, or the 
square root of the 50 % confidence factor defined in Eq, (20). 
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3. Statistical Reliability of Averaged Spectra 
Confidence limits for the averaged spectrum are computed by 
assuming that its equivalent degrees of freedom k equal the sum of 
the degrees of freedom in the individual spectra comprising the aver- 
age. 


N 

e- E 

3-1 


It can easily be shown that if the statistical "noise" present in each 
of the individual spectra is treated as a random fluctuation about 
the true power spectrum, the above expression for calculating k results 
in confidence limits that shrink at exactly the same rate as the 
"noise" when more and more spectra are averaged. 



V. AW EXAMPLE OF A POWER SPECTRUM CALCULATION 


As an example of how the information contained in the previous 
sections may be applied, the following will serve to illustrate the 
basic features of an MLP calculated power spectrum. The example 
given is from a calculation made in conjunction with the study of 
oscillatory phenomena in the solar atmosphere (Sentman, 1973). 

The raw data consisted of antenna temperature 5 sec data 
samples of solar microwave emission recorded on the North Liberty 
Radio Observatory 2-cm radiometer. Individual data sets ranged in 
length from 4 — 12 h. The power spectrum of each data set was to be 
calculated, and the average of all the individual spectra computed 
to obtain an average power spectrum for all the data sets. 

- The frequency range of interest to this study was 0 — 15 mHz 

_ X 

(l mHz - 10 Hz), or f =15 mHz. A sampling time At of 
l/2(l5 X 10 ) = 33-3 sec was thus indicated. The nearest that one 

could come to this figure using 5 sec data sanples was either 30 sec 
or 35 sec, corresponding to averaging the 5 sec samples by groups of 
6 or 7, respectively. Averaging by groups of 6 was chosen, yielding 
an effective sampling time At = 30 sec and a maximum frequency 
^max ~ l6»7 mHz. A typical data set with 30 sec resolution (effective 
sampling time) is shown in Figure 10. 
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It was known in advance of the calculation (by trial runs on 
several data sets) that the power spectra all contained a very large 
peak at near zero frequencies, thus indicating a need for 
pre-whitening. High -pass smoothing was achieved by means of the 
method described in section III-B by making 10 passes through the 
data with the smoothing function described by Eq. (12). The result- 
ing low- and high-pass power transfer functions are shown in Figure 
11. A maximum lag of 25 rain (= 50 lags X 30 sec) was chosen to 
balance frequency resolution against statistical reliability. This 
resulted in a frequency resolution Af = 0.33 mHz. The degrees of 
freedom k therefore ranged from 18.5 (4 h data set) to 56.9 (l2 h 
data set), providing the high degree of statistical reliability 
necessary for the study (very low amplitude fluctuations were being 
sought). 

The data normalization and power spectra calculations were 
achieved using the subroutine listed in Appendix I. After the power 
spectrum was calculated it was post-darkened to compensate for the 
pre-whitening by multiplying with the inverse of the high-pass power 
transfer function shown in Figure 11. An example of the resulting 
spectra with confidence limits, normalized to a value of 10 in the 
range of 4—5 mHz for display purposes, is shown in Figure 12 for 
the frequency range 0 — 15 mHz. All calculations were carried out on 
a Univac 4l8 computer. 
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FIGURE CAPTIONS 

Figure 9 The confidence factor as a function of degrees of freedom 
k (see page 35). 

Figure 10 Typical plot of antenna temperature versus time with 30 
sec time resolution. Data records used to compute power 
spectra were chosen to exclude the end pieces of each data 
day. 

Figure 11 Low-pass and high-pass filter functions R(f) and R'(f) 
used to pre-whiten the data prior to calculation of the 
power spectra. Post -darkening is achieved by multiplying 
the resultant spectra by l/R / (f). 

Figure 12 Power spectrum of the single record containing a statis- 
tically significant peak near b mHz. 
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APPENDIX I 

A FORTRAN SUBROUTINE FOR CALCULATING POWER SPECTRA 

The following is an example of a subroutine for an MLP calcu- 
lation of a power spectrum. Pre-whitening is assumed to have been 
completed prior to entry into the subroutine, and post -darkening and 
conversion to absolute units from normalized units is assumed to 
take place after return. 

The program was written from the definitions for the auto- 
correlation function and discrete cosine transform. However, to 
facilitate the transform operation a cosine table is constructed and 
a table look-up procedure is incorporated into the program rather 
than requiring a double precision cosine to be computed each time 
it is needed. The table is constructed in the first call of the 
subroutine and each time the autocorrelation function maximum lag is 
changed; succeeding calls use the table already constructed. 

As shown below, the program will accommodate a maximum of 1500 
data points with an autocorrelation function maximum lag of 500, 

These figures may be raised or lowered as desired by dimensioning the 
relevant arrays accordingly. 
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A. Subroutine Arguments 

Input variables: 

X - Double precision array containing data to be power 
spectrum analyzed 

N - Integer specifying the number of data points in X (may 
be smaller than the dimension size of X) 

MA - Integer specifying the maximum number of lags for the 

autocorrelation function (may be smaller than the dimen- 
sion size of R) 

Output variables: 

U - Single precision array containing M + 1 hamming smoothed 
normalized power spectral estimates 

R - Double precision array containing M + 1 values of the 
normalized autocorrelation coefficients 

XBAR - Double precision variable for the mean of the input 
data set 

SX - Double precision variable for the standard deviation 
of the input data set. This variable is necessary if 
absolute units are to be calculated for the power spec- 
trum (not done here) 
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B„ Subroutine 


SUBROUTINE SPECTR(X,N,MA,U,R,XBAR,SX) 

DOUBLE PRECISION X,AL,CS,XBAR,SX,R 

DOUBLE PRECISION FAC,ARG,DCOS,DBLE,DSQRT,PI,AEND 

DIMENSION X(1500),U(501),R(501),CS(501),AL(501) 

DATA Pi/ . 31415 92654D+01/ ,M/ 0/ 

NORMALIZE DATA TO ZERO MEAN AND UNIT STANDARD DEVIATION 
XBAR=0.0 
DO 3 1=1, N 
3 XBAR=XBAR+X ( I ) 

XBAR=XBAR/DBLE (FLOAT (N ) ) 

SX=0.0D+00 
DO 6 1=1, N 
X(l)=X(l) -XBAR 

6 SX=SX+X(l)*X(l) 

SX=DSQRT (SX/DBLE (FLOAT (N ) ) ) 

DO 7 1=1, N 

7 X(I)=X(I)/SX 

BUILD COSINE TABLE ,CS. FOR INTERVAL .0. to .PI. 
IF(MA.EQ.M) GO TO 200 
M=MA 
MP^I+1 

AVERG=DBLE (FLOAT (M ) ) 

DO 2 1=1, MP 
ARG=DBLE (FLOAT (i-l ) ) 

ARG=(PI*ARG)/AVERG 
2 CS(l)=DC0S (ARG) 

200 CONTINUE 

CALCULATE AUTOCORRELATION COEFFICIENTS 
DO 30 J=1,MP 
R(j)=0.0D+00 
JP=J-1 
IEND=N- JP 

AEND=DBLE (FLOAT ( IEND ) ) 

DO 25 1=1, IEND 
INDX=I+JP 

25 R ( J )=R ( J )+X ( I )*X (INDX ) 

30 r(j)=r(j)/aend 
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calculate power spectrum 

(FOURIER TRANSFORM A/C FUNCTION .R.) 

MX2=M*2 
MFX2=MX2+2 
MM=M-1 
DO 4o J=1,MP 

al(j)=o.od+oo 

JP=J-1 
DO 55 K=1,MM 

CALCULATE INDEX .INDCS. FOR RETRIEVING 
PROPER COSINE FRCM TABLE 
INDCS^OD ( ( JP*K ) ,MX2 )+l 
IF (INDCS. GT.MP) INDCS=MPX2- INDCS 
55 AL ( J )=AL ( j)+2 . 0D+0O*R (K+l )*CS (INDCS ) 

FAC=L. OD+OO 

IF(MOD(jP,2).EQ.l) FAC=-1.0D+00 
40 AL(j)=AL(j)+R(l)+R(MP)*FAC 

APPLY HAMMING SMOOTHING FUNCTION TO ESTIMATES .AL. 
TO YIELD SMOOTHED POWER SPECTRUM ESTIMATES .U. 

U ( I )=SNGL ( 0 . 5^D+00*AL (l ) +0 . 46D+00*AL ( 2 ) ) 

U (MP )=SNGL (0. 54D+00*AL (MP )+0 . 46d+oo*al (M ) ) 

DO 50 1=2 ,M 

50 U (I )=SNGL ( 0 . 54D+00*AL ( I )+0 . 25D+00* (AL ( I-I )+AL ( 1+1 ) ) ) 
RETURN 
END 
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